A short description of the post.
Many of the Abila, Kronos-based employees of GAStech have company cars which are approved for both personal and business use. Those who do not have company cars have the ability to check out company trucks for business use, but these trucks cannot be used for personal business.
Employees with company cars are happy to have these vehicles, because the company cars are generally much higher quality than the cars they would be able to afford otherwise. However, GAStech does not trust their employees. Without the employees? knowledge, GAStech has installed geospatial tracking software in the company vehicles. The vehicles are tracked periodically as long as they are moving.
This vehicle tracking data has been made available to law enforcement to support their investigation. Unfortunately, data is not available for the day the GAStech employees went missing. Data is only available for the two weeks prior to the disappearance.
To promote local businesses, Kronos based companies provide a Kronos Kares benefit card to GASTech employees giving them discounts and rewards in exchange for collecting information about their credit card purchases and preferences as recorded on loyalty cards. This data has been made available to investigators in the hopes that it can help resolve the situation. However, Kronos Kares does not collect personal information beyond purchases.
In this question, we need to find out the popular locations. Dataset cc_data and loyalty_data will be used in this question.
Firstly, a loop structure is created to library all the packages needed.
packages = c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'lubridate', 'clock', 'tidyverse','dplyr', 'tidyr','raster','sf','sp','tmap', 'gifski', 'writexl', 'mapview', "ggplot2", 'dplyr')
for (p in packages){
if(!require(p, character.only = T)){
install.packages(p)
}
library(p,character.only = T)
}
Load the credit card dataset in the environment. We can also use the function “glimpse” to check the data type of the variables.
credit_card <- read.csv("data_MC2/cc_data.csv")
glimpse(credit_card)
Rows: 1,490
Columns: 4
$ timestamp <chr> "1/6/2014 7:28", "1/6/2014 7:34", "1/6/2014 7:35"~
$ location <chr> "Brew've Been Served", "Hallowed Grounds", "Brew'~
$ price <dbl> 11.34, 52.22, 8.33, 16.72, 4.24, 4.17, 28.73, 9.6~
$ last4ccnum <int> 4795, 7108, 6816, 9617, 7384, 5368, 7253, 4948, 9~
Load the loyalty card dataset in the environment.
loyalty_card <- read.csv("data_MC2/loyalty_data.csv")
glimpse(loyalty_card)
Rows: 1,392
Columns: 4
$ timestamp <chr> "1/6/2014", "1/6/2014", "1/6/2014", "1/6/2014", "~
$ location <chr> "Brew've Been Served", "Brew've Been Served", "Ha~
$ price <dbl> 4.17, 9.60, 16.53, 11.51, 12.93, 4.27, 11.20, 15.~
$ loyaltynum <chr> "L2247", "L9406", "L8328", "L6417", "L1107", "L40~
Change the datatype of variable timestamp from character to date-time format.
credit_card$timestamp <- date_time_parse(credit_card$timestamp,
zone = "",
format = "%m/%d/%Y %H:%M")
glimpse(credit_card)
Rows: 1,490
Columns: 4
$ timestamp <dttm> 2014-01-06 07:28:00, 2014-01-06 07:34:00, 2014-0~
$ location <chr> "Brew've Been Served", "Hallowed Grounds", "Brew'~
$ price <dbl> 11.34, 52.22, 8.33, 16.72, 4.24, 4.17, 28.73, 9.6~
$ last4ccnum <int> 4795, 7108, 6816, 9617, 7384, 5368, 7253, 4948, 9~
Change the datatype of variable timestamp from character to date format.
loyalty_card$timestamp <- date_time_parse(loyalty_card$timestamp,
zone = "",
format = "%m/%d/%Y")
glimpse(loyalty_card)
Rows: 1,392
Columns: 4
$ timestamp <dttm> 2014-01-06, 2014-01-06, 2014-01-06, 2014-01-06, ~
$ location <chr> "Brew've Been Served", "Brew've Been Served", "Ha~
$ price <dbl> 4.17, 9.60, 16.53, 11.51, 12.93, 4.27, 11.20, 15.~
$ loyaltynum <chr> "L2247", "L9406", "L8328", "L6417", "L1107", "L40~
In order to explore the further corresponding relationship between credit card and loyalty card, we need to join the two datatables together. However, considering that there is not a common column for joining, a new column “Date” is needed to be extracted from the variable timestamp in credit_card.csv.
credit_card$Date <- format(credit_card$timestamp, format="%Y-%m-%d")
credit_card$Date <- date_time_parse(credit_card$Date,
zone = "",
format = "%Y-%m-%d")
glimpse(credit_card)
Rows: 1,490
Columns: 5
$ timestamp <dttm> 2014-01-06 07:28:00, 2014-01-06 07:34:00, 2014-0~
$ location <chr> "Brew've Been Served", "Hallowed Grounds", "Brew'~
$ price <dbl> 11.34, 52.22, 8.33, 16.72, 4.24, 4.17, 28.73, 9.6~
$ last4ccnum <int> 4795, 7108, 6816, 9617, 7384, 5368, 7253, 4948, 9~
$ Date <dttm> 2014-01-06, 2014-01-06, 2014-01-06, 2014-01-06, ~
Then, we can use variable “Date”, “location”, and “price” to finish the full join of the two tables. According to the regulation of Gas Tech, the records of loyalty card are usually along with the records of credit card. Therefore, we can find out the combination of credit card and loyalty card after joining.
card_joined <- credit_card %>%
full_join(loyalty_card, by = c("Date" = "timestamp", "location", "price"))
In order to find out the popular locations, we need to the calculate the number of records of locations and then choose those with more records. Popular locations are explored for credit cards and loyalty cards separately.
popular_credit_card <- credit_card %>%
group_by(location) %>%
summarise(count = n()) %>%
arrange(desc(count))
popular_loyalty_card <- loyalty_card %>%
group_by(location) %>%
summarise(count = n()) %>%
arrange(desc(count))
According to the datatable, we will consider the Top6 as the popular locations.
popular_top_credit <- popular_credit_card %>%
gather(location, count) %>%
arrange(desc(count)) %>%
top_n(6)
popular_top_credit
# A tibble: 6 x 2
location count
<chr> <int>
1 Katerina's Cafe 212
2 Hippokampos 171
3 Guy's Gyros 158
4 Brew've Been Served 156
5 Hallowed Grounds 92
6 Ouzeri Elian 87
Then we can draw the bar chart of the popular locations selected.
top6_credit<-ggplot(data=popular_top_credit, aes(x=location, y=count)) +
geom_bar(stat="identity", fill="steelblue")+
theme_minimal()
top6_credit

popular_top_loyalty <- popular_loyalty_card %>%
gather(location, count) %>%
arrange(desc(count)) %>%
top_n(6)
popular_top_loyalty
# A tibble: 6 x 2
location count
<chr> <int>
1 Katerina's Cafe 195
2 Hippokampos 155
3 Guy's Gyros 146
4 Brew've Been Served 140
5 Ouzeri Elian 84
6 Hallowed Grounds 80
top6_loyalty<-ggplot(data=popular_top_loyalty, aes(x=location, y=count)) +
geom_bar(stat="identity", fill="steelblue")+
theme_minimal()
top6_loyalty

Therefore, combining the result of loyalty card and credit card, the popular locations are the 6 locations shown before. And we can also create a new datatable only containing the card information of popular locations.
The peak time in the Katerina’s Cafe is 13pm - 14pm and 19pm - 20pm, which are the lunch time and dinner time respectively. Similarly, Hippokampos, Guy’s Gyros and Ouzeri Elian are all restaurants and they have the same peak hours as Katerina’s Cafe.
popular_locations %>%
filter(location == "Katerina's Cafe")
timestamp location price last4ccnum loyaltynum
1 2014-01-06 12:56:00 Katerina's Cafe 13.68 8332 <NA>
2 2014-01-06 13:28:00 Katerina's Cafe 33.54 8411 L6110
3 2014-01-06 13:46:00 Katerina's Cafe 18.56 6899 L6267
4 2014-01-06 13:50:00 Katerina's Cafe 13.59 1874 L4424
5 2014-01-06 13:50:00 Katerina's Cafe 32.64 9617 L5553
6 2014-01-06 13:53:00 Katerina's Cafe 36.95 2142 <NA>
7 2014-01-06 13:59:00 Katerina's Cafe 39.41 3853 L1485
8 2014-01-06 14:17:00 Katerina's Cafe 19.65 6901 L9363
9 2014-01-06 18:56:00 Katerina's Cafe 78.65 8411 <NA>
10 2014-01-06 19:08:00 Katerina's Cafe 26.46 8129 L8328
11 2014-01-06 20:07:00 Katerina's Cafe 23.63 7253 L1682
12 2014-01-06 20:08:00 Katerina's Cafe 12.32 1874 L4424
13 2014-01-06 20:19:00 Katerina's Cafe 21.66 2418 <NA>
14 2014-01-06 20:29:00 Katerina's Cafe 17.92 9405 L3259
15 2014-01-06 20:30:00 Katerina's Cafe 28.00 6691 L6267
16 2014-01-06 20:33:00 Katerina's Cafe 15.52 5921 L3295
17 2014-01-06 20:59:00 Katerina's Cafe 26.10 3492 L7814
18 2014-01-06 21:03:00 Katerina's Cafe 10.72 9617 <NA>
19 2014-01-06 21:12:00 Katerina's Cafe 26.51 4948 L9406
20 2014-01-07 13:08:00 Katerina's Cafe 59.07 7792 <NA>
21 2014-01-07 13:33:00 Katerina's Cafe 22.27 7889 L6119
22 2014-01-07 13:41:00 Katerina's Cafe 51.41 3484 <NA>
23 2014-01-07 13:57:00 Katerina's Cafe 37.44 8202 L2343
24 2014-01-07 14:07:00 Katerina's Cafe 10.90 4434 L2169
25 2014-01-07 20:09:00 Katerina's Cafe 19.49 7253 L1682
26 2014-01-07 20:11:00 Katerina's Cafe 10.07 2142 L9637
27 2014-01-07 20:12:00 Katerina's Cafe 55.63 1874 <NA>
28 2014-01-07 20:13:00 Katerina's Cafe 94.75 7889 <NA>
29 2014-01-07 20:13:00 Katerina's Cafe 65.02 8202 <NA>
30 2014-01-07 20:18:00 Katerina's Cafe 31.60 8411 L6110
31 2014-01-07 20:32:00 Katerina's Cafe 19.53 5921 L3295
32 2014-01-07 20:33:00 Katerina's Cafe 17.26 9617 L5553
33 2014-01-07 20:33:00 Katerina's Cafe 25.54 6691 L6267
34 2014-01-07 20:47:00 Katerina's Cafe 32.83 3492 L7814
35 2014-01-07 21:01:00 Katerina's Cafe 33.29 9405 L3259
36 2014-01-07 21:19:00 Katerina's Cafe 18.35 1310 L8012
37 2014-01-08 13:19:00 Katerina's Cafe 29.26 7792 L5756
38 2014-01-08 13:28:00 Katerina's Cafe 8.53 6899 L6267
39 2014-01-08 13:34:00 Katerina's Cafe 88.98 1310 <NA>
40 2014-01-08 13:41:00 Katerina's Cafe 26.45 2142 L9637
41 2014-01-08 13:42:00 Katerina's Cafe 57.04 6895 <NA>
42 2014-01-08 13:47:00 Katerina's Cafe 27.05 1877 <NA>
43 2014-01-08 14:06:00 Katerina's Cafe 22.94 7108 <NA>
44 2014-01-08 19:32:00 Katerina's Cafe 34.25 5407 L4034
45 2014-01-08 20:00:00 Katerina's Cafe 13.83 8411 L6110
46 2014-01-08 20:08:00 Katerina's Cafe 31.59 1310 L8012
47 2014-01-08 20:27:00 Katerina's Cafe 33.31 2142 L9637
48 2014-01-08 20:31:00 Katerina's Cafe 36.60 7889 L6119
49 2014-01-08 20:35:00 Katerina's Cafe 16.93 8202 L2343
50 2014-01-08 20:42:00 Katerina's Cafe 92.83 5921 <NA>
51 2014-01-08 21:02:00 Katerina's Cafe 28.86 4948 <NA>
52 2014-01-08 21:09:00 Katerina's Cafe 27.87 1874 L4424
53 2014-01-09 13:03:00 Katerina's Cafe 17.02 7253 L1682
54 2014-01-09 13:23:00 Katerina's Cafe 16.98 7792 L5756
55 2014-01-09 13:27:00 Katerina's Cafe 25.85 8156 L5224
56 2014-01-09 13:35:00 Katerina's Cafe 10.42 3853 L1485
57 2014-01-09 14:06:00 Katerina's Cafe 8.01 1877 L3014
58 2014-01-09 19:25:00 Katerina's Cafe 35.92 8129 L8328
59 2014-01-09 19:30:00 Katerina's Cafe 26.60 5921 L9406
60 2014-01-09 19:30:00 Katerina's Cafe 26.60 5921 L3295
61 2014-01-09 20:06:00 Katerina's Cafe 26.60 4948 L9406
62 2014-01-09 20:06:00 Katerina's Cafe 26.60 4948 L3295
63 2014-01-09 20:08:00 Katerina's Cafe 19.02 3492 L7814
64 2014-01-09 20:09:00 Katerina's Cafe 90.97 6691 <NA>
65 2014-01-09 20:20:00 Katerina's Cafe 91.14 9405 <NA>
66 2014-01-09 20:22:00 Katerina's Cafe 29.82 8202 L2343
67 2014-01-09 20:27:00 Katerina's Cafe 53.78 5407 <NA>
68 2014-01-09 20:38:00 Katerina's Cafe 30.69 2418 <NA>
69 2014-01-09 21:18:00 Katerina's Cafe 8.57 1310 L8012
70 2014-01-10 13:30:00 Katerina's Cafe 34.76 7688 L4164
71 2014-01-10 13:36:00 Katerina's Cafe 25.36 7354 L9254
72 2014-01-10 13:38:00 Katerina's Cafe 21.22 2418 L9018
73 2014-01-10 13:47:00 Katerina's Cafe 25.62 6895 L3366
74 2014-01-10 13:47:00 Katerina's Cafe 24.33 2142 L9637
75 2014-01-10 14:08:00 Katerina's Cafe 77.62 7108 <NA>
76 2014-01-10 19:50:00 Katerina's Cafe 19.57 8129 <NA>
77 2014-01-10 19:56:00 Katerina's Cafe 21.89 5921 L3295
78 2014-01-10 20:08:00 Katerina's Cafe 15.76 9405 L3259
79 2014-01-10 20:09:00 Katerina's Cafe 8.14 8411 L6110
80 2014-01-10 20:42:00 Katerina's Cafe 21.28 5407 L4034
81 2014-01-11 13:19:00 Katerina's Cafe 45.22 8156 L5224
82 2014-01-11 13:23:00 Katerina's Cafe 45.21 2418 L9018
83 2014-01-11 13:32:00 Katerina's Cafe 29.10 6895 L3366
84 2014-01-11 13:43:00 Katerina's Cafe 71.65 3484 L2490
85 2014-01-11 13:50:00 Katerina's Cafe 23.45 1321 L4149
86 2014-01-11 14:17:00 Katerina's Cafe 26.02 2142 L9637
87 2014-01-11 19:08:00 Katerina's Cafe 55.67 9617 L5553
88 2014-01-11 19:31:00 Katerina's Cafe 33.18 2142 L9637
89 2014-01-11 19:32:00 Katerina's Cafe 20.71 6691 L6267
90 2014-01-11 19:40:00 Katerina's Cafe 57.36 2540 L5947
91 2014-01-11 19:50:00 Katerina's Cafe 12.55 7889 L6119
92 2014-01-11 19:51:00 Katerina's Cafe 45.68 1310 L8012
93 2014-01-11 19:52:00 Katerina's Cafe 15.72 8411 <NA>
94 2014-01-11 20:00:00 Katerina's Cafe 25.76 4948 L9406
95 2014-01-11 20:06:00 Katerina's Cafe 52.45 8202 L2343
96 2014-01-11 20:13:00 Katerina's Cafe 9.78 9405 L3259
97 2014-01-11 20:24:00 Katerina's Cafe 94.68 2418 L9018
98 2014-01-11 20:46:00 Katerina's Cafe 57.60 1874 <NA>
99 2014-01-12 19:18:00 Katerina's Cafe 12.64 7889 L6119
100 2014-01-12 19:21:00 Katerina's Cafe 11.81 7253 L1682
101 2014-01-12 19:43:00 Katerina's Cafe 40.94 5407 L4034
102 2014-01-12 19:47:00 Katerina's Cafe 17.36 3492 <NA>
103 2014-01-12 20:04:00 Katerina's Cafe 39.30 8129 L8328
104 2014-01-12 20:11:00 Katerina's Cafe 67.14 3547 L9362
105 2014-01-12 20:12:00 Katerina's Cafe 15.37 9405 L3259
106 2014-01-12 20:13:00 Katerina's Cafe 39.55 2142 <NA>
107 2014-01-12 20:17:00 Katerina's Cafe 16.34 8411 L6110
108 2014-01-12 20:35:00 Katerina's Cafe 40.25 4948 L9406
109 2014-01-12 20:48:00 Katerina's Cafe 87.09 1874 <NA>
110 2014-01-13 13:23:00 Katerina's Cafe 38.98 2418 L9018
111 2014-01-13 13:34:00 Katerina's Cafe 26.15 6899 L6267
112 2014-01-13 13:36:00 Katerina's Cafe 17.21 9405 <NA>
113 2014-01-13 13:40:00 Katerina's Cafe 8.29 1874 L4424
114 2014-01-13 13:48:00 Katerina's Cafe 24.26 3492 L7814
115 2014-01-13 13:52:00 Katerina's Cafe 29.55 3547 L9362
116 2014-01-13 19:50:00 Katerina's Cafe 89.83 3547 <NA>
117 2014-01-13 19:59:00 Katerina's Cafe 30.84 6691 <NA>
118 2014-01-13 20:24:00 Katerina's Cafe 35.70 2142 L9637
119 2014-01-13 20:33:00 Katerina's Cafe 91.36 1310 <NA>
120 2014-01-13 20:44:00 Katerina's Cafe 93.57 5407 <NA>
121 2014-01-13 20:47:00 Katerina's Cafe 22.24 7253 <NA>
122 2014-01-13 20:51:00 Katerina's Cafe 55.82 7889 <NA>
123 2014-01-13 21:00:00 Katerina's Cafe 27.59 1874 L4424
124 2014-01-13 21:16:00 Katerina's Cafe 33.11 9405 L3259
125 2014-01-14 13:14:00 Katerina's Cafe 29.12 8156 L5224
126 2014-01-14 13:21:00 Katerina's Cafe 34.45 7792 <NA>
127 2014-01-14 13:22:00 Katerina's Cafe 10.03 6895 L3366
128 2014-01-14 13:32:00 Katerina's Cafe 20.95 7354 L9254
129 2014-01-14 13:34:00 Katerina's Cafe 29.10 4795 L8566
130 2014-01-14 13:41:00 Katerina's Cafe 75.46 3547 <NA>
131 2014-01-14 13:43:00 Katerina's Cafe 31.69 6691 L6267
132 2014-01-14 13:58:00 Katerina's Cafe 33.54 7819 L5259
133 2014-01-14 20:09:00 Katerina's Cafe 91.80 7889 <NA>
134 2014-01-14 20:15:00 Katerina's Cafe 27.67 8129 <NA>
135 2014-01-14 20:17:00 Katerina's Cafe 36.95 3547 L9362
136 2014-01-14 20:19:00 Katerina's Cafe 31.94 5407 L4034
137 2014-01-14 20:26:00 Katerina's Cafe 12.65 1874 L4424
138 2014-01-14 20:34:00 Katerina's Cafe 11.89 3492 L7814
139 2014-01-14 20:42:00 Katerina's Cafe 46.61 8202 <NA>
140 2014-01-14 20:50:00 Katerina's Cafe 25.40 8411 L6110
141 2014-01-14 21:10:00 Katerina's Cafe 18.35 9617 L5553
142 2014-01-15 13:13:00 Katerina's Cafe 24.15 7792 L5756
143 2014-01-15 13:17:00 Katerina's Cafe 88.56 2418 <NA>
144 2014-01-15 13:27:00 Katerina's Cafe 55.60 1286 L3288
145 2014-01-15 14:00:00 Katerina's Cafe 12.12 1310 L8012
146 2014-01-15 19:50:00 Katerina's Cafe 39.23 3492 L7814
147 2014-01-15 19:59:00 Katerina's Cafe 25.08 8129 <NA>
148 2014-01-15 20:00:00 Katerina's Cafe 29.84 2142 <NA>
149 2014-01-15 20:05:00 Katerina's Cafe 11.45 9617 L5553
150 2014-01-15 20:17:00 Katerina's Cafe 29.33 8411 L6110
151 2014-01-15 20:23:00 Katerina's Cafe 39.63 1310 L8012
152 2014-01-15 20:26:00 Katerina's Cafe 61.61 8202 <NA>
153 2014-01-15 20:29:00 Katerina's Cafe 29.66 2418 L9018
154 2014-01-15 21:05:00 Katerina's Cafe 13.77 7253 L1682
155 2014-01-15 21:21:00 Katerina's Cafe 27.48 3547 L9362
156 2014-01-16 13:17:00 Katerina's Cafe 11.08 9683 L7291
157 2014-01-16 13:19:00 Katerina's Cafe 27.82 7688 L4164
158 2014-01-16 13:23:00 Katerina's Cafe 34.90 7792 L5756
159 2014-01-16 13:27:00 Katerina's Cafe 34.02 3484 L2490
160 2014-01-16 13:27:00 Katerina's Cafe 39.74 5407 L4034
161 2014-01-16 13:43:00 Katerina's Cafe 38.32 5368 L2247
162 2014-01-16 18:49:00 Katerina's Cafe 93.68 8411 <NA>
163 2014-01-16 19:39:00 Katerina's Cafe 10.93 7889 L6119
164 2014-01-16 19:47:00 Katerina's Cafe 34.34 3547 <NA>
165 2014-01-16 20:05:00 Katerina's Cafe 69.30 5407 <NA>
166 2014-01-16 20:09:00 Katerina's Cafe 18.79 8129 L8328
167 2014-01-16 20:23:00 Katerina's Cafe 13.91 9617 L5553
168 2014-01-16 20:25:00 Katerina's Cafe 35.43 2418 L9018
169 2014-01-16 20:28:00 Katerina's Cafe 30.84 9405 <NA>
170 2014-01-16 20:36:00 Katerina's Cafe 19.44 2142 L9637
171 2014-01-16 20:39:00 Katerina's Cafe 30.56 8202 L2343
172 2014-01-16 20:51:00 Katerina's Cafe 34.58 3492 L7814
173 2014-01-17 13:42:00 Katerina's Cafe 21.01 3547 L9362
174 2014-01-17 13:48:00 Katerina's Cafe 13.11 1321 L4149
175 2014-01-17 13:53:00 Katerina's Cafe 35.52 4434 L2169
176 2014-01-17 14:15:00 Katerina's Cafe 70.72 4948 <NA>
177 2014-01-17 19:24:00 Katerina's Cafe 22.18 3492 L7814
178 2014-01-17 19:57:00 Katerina's Cafe 34.93 8129 <NA>
179 2014-01-17 20:07:00 Katerina's Cafe 21.22 1310 L8012
180 2014-01-17 20:19:00 Katerina's Cafe 36.12 8202 L2343
181 2014-01-17 20:20:00 Katerina's Cafe 19.09 7253 L1682
182 2014-01-17 20:24:00 Katerina's Cafe 72.13 7889 <NA>
183 2014-01-17 20:58:00 Katerina's Cafe 38.21 2142 L9637
184 2014-01-17 21:05:00 Katerina's Cafe 33.73 5407 L4034
185 2014-01-17 21:06:00 Katerina's Cafe 27.49 9405 L3259
186 2014-01-18 13:15:00 Katerina's Cafe 34.54 6691 L6267
187 2014-01-18 13:34:00 Katerina's Cafe 19.30 3484 L2490
188 2014-01-18 14:00:00 Katerina's Cafe 37.26 8411 L6110
189 2014-01-18 14:03:00 Katerina's Cafe 96.36 7688 L4164
190 2014-01-18 14:06:00 Katerina's Cafe 24.14 7819 <NA>
191 2014-01-18 19:17:00 Katerina's Cafe 21.63 8129 <NA>
192 2014-01-18 19:30:00 Katerina's Cafe 8.14 7889 L6119
193 2014-01-18 19:31:00 Katerina's Cafe 22.74 1874 L4424
194 2014-01-18 19:32:00 Katerina's Cafe 33.41 4434 L2169
195 2014-01-18 19:36:00 Katerina's Cafe 35.79 6691 <NA>
196 2014-01-18 19:46:00 Katerina's Cafe 11.19 8202 <NA>
197 2014-01-18 19:47:00 Katerina's Cafe 16.48 4948 <NA>
198 2014-01-18 19:53:00 Katerina's Cafe 76.10 3547 L9362
199 2014-01-18 19:54:00 Katerina's Cafe 33.63 1310 L8012
200 2014-01-18 19:56:00 Katerina's Cafe 32.68 2418 L9018
201 2014-01-18 20:01:00 Katerina's Cafe 17.37 9405 <NA>
202 2014-01-18 20:04:00 Katerina's Cafe 45.88 8411 L6110
203 2014-01-18 20:09:00 Katerina's Cafe 19.31 7354 L9254
204 2014-01-18 20:13:00 Katerina's Cafe 9.72 3853 L1485
205 2014-01-18 20:15:00 Katerina's Cafe 37.21 7819 L5259
206 2014-01-19 13:00:00 Katerina's Cafe 35.01 7354 L9254
207 2014-01-19 13:23:00 Katerina's Cafe 70.22 4795 L8566
208 2014-01-19 13:53:00 Katerina's Cafe 26.76 4434 L2169
209 2014-01-19 14:01:00 Katerina's Cafe 12.78 1310 <NA>
210 2014-01-19 14:20:00 Katerina's Cafe 54.20 3492 <NA>
211 2014-01-19 18:54:00 Katerina's Cafe 72.25 3547 <NA>
212 2014-01-19 19:53:00 Katerina's Cafe 34.03 7889 L6119
213 2014-01-19 20:05:00 Katerina's Cafe 33.79 1874 L4424
214 2014-01-19 20:22:00 Katerina's Cafe 38.90 9617 L5553
Brew’ve Been Served and Hallowed Grounds are all coffee shop and the peak hours are 7am-8am.
Actually, as loyalty card can be considered as the proof of personal identity, generally, one loyalty card can only be used by one person. And credit can only be owned or used by one person unless there is kinship between two users. Therefore, one credit card can only be related to one loyalty card. Once there are several loyalty cards records under the same credit card, such credit card should be labeled as abnormal one.
abnormal_credit_card <- popular_locations %>%
drop_na(loyaltynum) %>%
group_by(last4ccnum) %>%
summarize(loy_n = n_distinct(loyaltynum)) %>%
filter(loy_n > 1)
abnormal_credit_card
# A tibble: 7 x 2
last4ccnum loy_n
<int> <int>
1 1286 2
2 4795 2
3 4948 2
4 5368 2
5 5921 2
6 7889 2
7 8332 2
The abnormal credit cards are as shown before, which will be analyzed further in the following parts.
In this question, the abnormal records are required to combine with vehicle data to dig out more information. The data “car_assignment” is comparably simple which only include some basic information of employees and there is no connections between card information and car information. So the data of gps should be used in this question.
Firstly, we should load the map in the environment.
bgmap <- raster("data_MC2/MC2-tourist.tif")
bgmap
class : RasterLayer
band : 1 (of 3 bands)
dimensions : 1595, 2706, 4316070 (nrow, ncol, ncell)
resolution : 3.16216e-05, 3.16216e-05 (x, y)
extent : 24.82419, 24.90976, 36.04499, 36.09543 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs
source : MC2-tourist.tif
names : MC2.tourist
values : 0, 255 (min, max)
Plot raster layer.
tmap_mode("plot")
tm_shape(bgmap) +
tm_raster(bgmap,
legend.show = FALSE)

tm_shape(bgmap) +
tm_rgb(bgmap, r = 1,g = 2,b = 3,
alpha = NA,
saturation = 1,
interpolate = TRUE,
max.value = 255)

Import vector GIS data file.
Abila_st <- st_read(dsn = "data_MC2/Geospatial",
layer = "Abila")
Reading layer `Abila' from data source
`F:\Visual Analysis\yining-ai\Makeover1\_posts\2021-07-13-assignment\data_MC2\Geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 3290 features and 9 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: 24.82401 ymin: 36.04502 xmax: 24.90997 ymax: 36.09492
Geodetic CRS: WGS 84
In order to create relationship between credit card records and gps information, we need to find out the parking time and the driving time associated with each transaction. Here a new variable “Time Difference” is created to calculate the time interval.
gps2 <- read_csv("data_MC2/gps2.csv")
glimpse(gps2)
Rows: 685,169
Columns: 6
$ Timestamp <chr> "1/6/2014 7:20", "1/6/2014 7:20", "1/6/201~
$ id <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ~
$ lat <dbl> 36.06646, 36.06634, 36.06615, 36.06613, 36~
$ long <dbl> 24.88258, 24.88259, 24.88258, 24.88258, 24~
$ `Time Difference` <time> NA, 00:02:00, 00:03:00, 00:01:00, 0~
$ Seconds <dbl> 0, 2, 3, 1, 3, 1, 1, 1, 4, 1, 1, 2, 3, 1, ~
Change the datatype of variable Timestamp.
gps2$Timestamp <- date_time_parse(gps2$Timestamp,
zone = "",
format = "%m/%d/%Y %H:%M")
gps2$id <- as_factor(gps2$id)
To create the roadmap, longitude and latitude should be combined as a coordination point.
gps_sf <- st_as_sf(gps2,
coords = c("long", "lat"),
crs= 4326)
To facilitate filtering, variables “day”, “hour” and “minute” can be extracted.
Find out those stop time points whose time interval is longer than 3 minutes.
more_than_3mins <- gps_sf %>%
filter(Seconds >180)
gps_path <- gps_sf %>%
group_by(id, day, hour, minute) %>%
summarize(m = mean(Timestamp),
do_union=FALSE) %>%
st_cast("LINESTRING")
p = npts(gps_path, by_feature = TRUE)
gps_path2 <- cbind(gps_path, p)
gps_path2 <- gps_path2 %>%
filter(p>1)
Draw the roadmap. We can draw movement track at any hour and minute. This can help us to show the employees’ daily lives.
gps_path_selected <- gps_path2 %>%
filter(day == "06", hour == "08")
tmap_mode("view")
tm_shape(bgmap) +
tm_rgb(bgmap, r = 1,g = 2,b = 3,
alpha = NA,
saturation = 1,
interpolate = TRUE,
max.value = 255) +
tm_shape(gps_path_selected) +
tm_lines()
In order to find those abnormal points, point graph can be drawn to help build connection between car ID and transaction location. We can firstly select the abnormal credit cards’ information.
card_selected <- card_joined %>%
filter(last4ccnum == 1286)
gps_dot <- more_than_3mins %>%
group_by(id, hour, day, minute) %>%
summarize(geo_n = n_distinct(geometry)) %>%
st_cast("POINT")
We can draw anyone’s stop point on any day at any time. Then set the abnormal time to dig out the suspicious activities.
In this way, we can find the ownerships of credit cards.
abnormal_cc <- read_csv("data_MC2/abnormal_cc.csv")
abnormal_cc
# A tibble: 9 x 6
CC_number Loyalty_number Car_ID Name CurrentEmploymentTy~
<dbl> <chr> <chr> <chr> <chr>
1 1286 L3572 22 Adra Nubarron Security
2 1286 L3288 22 Adra Nubarron Security
3 4795 L8566 34 Edvard Vann Security
4 4948 L3295 18 Birgitta Frente Engineering
5 5921 L9406 29 Bertrand Ovan Facilities
6 5921 L3295 29 Bertrand Ovan Facilities
7 7889 L6119 8 Lucas Alcazar Information Technol~
8 7889 L2247 8/22/6 - -
9 8332 L2070 10 Ada Campo-Corr~ Executive
# ... with 1 more variable: CurrentEmploymentTitle <chr>
In this question, we can use the same method to draw the dot graph to recognize all the owners of credit cards as the Questions shown before. Now we can infer all the ownerships.
total_match <- read_csv("data_MC2/total_match.csv")
total_match
# A tibble: 44 x 6
CC_number Loyalty_number Car_ID Name CurrentEmploymentTy~
<dbl> <chr> <dbl> <chr> <chr>
1 9551 L5777 1 Nils Calixto Information Technol~
2 1415 L7783 2 Lars Azada Engineering
3 9635 L3191 3 Felix Balas Engineering
4 7688 L4164 4 Ingrid Barran~ Executive
5 6899 L6267 5 Isak Baza Information Technol~
6 7253 L1682 6 Linnea Bergen Information Technol~
7 2540 L5947 7 Elsa Orilla Engineering
8 1877 L3014 9 Gustav Cazar Engineering
9 1311 L4149 11 Axel Calzas Engineering
10 7108 L6544 12 Hideki Cocina~ Security
# ... with 34 more rows, and 1 more variable:
# CurrentEmploymentTitle <chr>
When compare the transaction records and gps information, we found that some vehicles were in the same geographical location at the same time. Such institution can be regarded as suspicious ones and to draw their
movement track to explore their relationship.
The first suspicious pair we find is card ID 33 and 7. The line graph and dot graph are dyrawn below. The two visited the Chostus Hotel several time at 13pm and visited restaurants at evening. And they even have some tracks in the same apartment. The owners of the cars are Elsa Orilla and Brand Tempestad seperately. And they have the same employment type and employment title.
The second suspect group are Car ID 22, 30 and 15. There is a lot of overlap in the tracks of these cars. They often buy coffee in Brew’ve been Served and often have lunch in the nearby restaurants. The owners of the three cars are Nubarron Adra, Loreto Bodrogi and Felix Resumir. These three employees are all security guards. Therefore, we can assume that they have comparably close relationship.
We draw dot plot at 2am, 3am, 4am, and 5am when generally very few people are outside. 15, 21, 24 and 16 are suspicious car id. There are several people gathering in the Frydo’s Autosupply N More. Someone even drive to parks far from the downtown center. Therefore, there are some suspicious activity locations: Spetson Park, Taxiarchon Park and Frydo’s Autosupply N More.